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Abstract — The SPS-LASSO has recently been introduced as a 
solution to the problem of regularization parameter selection in 
the complex-valued LASSO problem. Still, the dependence on the 
grid size and the polynomial time of performing convex optimiza- 
tion technique in each iteration, in addition to the deflciencies in 
the low noise regime, confincs its performance for Direction of 
Arrival (DOA) estimation. This work presents methods to apply 
LASSO without grid size limitation and with less complexity. 
As we show by simulations, the proposed methods loose a 
negligible performance compared to the Maximum Likelihood 
(ML) estimator, which needs a combinatorial search We also 
show by simulations that compared to practical implementations 
of ML, the proposed techniques are less sensitive to the source 
power difference. 

I. INTRODUCTION 

The Least Absolute Shrinkage and Selection Operator 
(LASSO) is a method of choosing a small set of bases among 
a large collection, best representing a set of data linearly. It is 
based on l\ regularization of an Ordinary Least Square (OLS), 
which always gives a sparse solution based on the conic nature 
of the cost function [1 1. The consistency of such a technique, 
under the assumption of the "best collection", is well discussed 
in an asymptotic case that both the data dimension and the size 
of collection increase to infinity J2|. 

A much different attempt has been made by applying 
LASSO into a finite dimensional data set with an asymp- 
totically large set of basis dictated by a physical model. A 
well known example is in [3|, where the LASSO technique is 
applied to the problem of estimating Direction Of Arrivals 
(DOAs) which can be expressed and solved in a linear 
regression fashion by discretizing the DOA parameter space. 
This method gained more attention since the solution could be 
found robustly, independent of the choice of the initial values 
in the numerical optimization technique due to the convex 
nature of the cost function. 

Unlike the statistical regression application of LASSO, the 
choice of the Regularization Parameter (RP) is critical in the 
current application. This parameter implements the trade off 
between model precision and model order. There are numerous 
ways to estimate the RP when the true number of sources is 
unknown H, |5). However, the estimation quality is always 
improved by choosing a smaller RP value with the same model 
order. On the other hand, finding the smallest such parameter 
is not straightforward, since the implementation of LASSO by 
convex programming techniques |6) is computationally costly, 
and can not be performed for a fine search over possible 
RP values. A stagewise solution is found in |7) for the real 
regression case, by the observing that the homotopy path of 



LASSO solutions is piecewise linear JS). Later, in f9| the idea 
was generalized to the complex problems by introducing a 
different LASSO optimality condition which is called Singular 
Point Selection (SPS)-LASSO. Decreasingly in RP, the SPS- 
LASSO follows the points in the homotopy path, in which a 
new regressor is born. These points are known as the candidate 
points. 

The SPS-LASSO technique is not suitable for the DOA 
estimation problem, especially in low SNR cases, due to the 
fact that the solution support moves continuously between 
two singular points. Furthermore, the discretized nature of 
the problem introduces an additional quantization error. Ac- 
cordingly, in this work, we introduce an alternative recursive 
solution, by keeping only the estimate support at each iteration 
and modifying the optimality conditions to avoid a discretized 
space. The resulting algorithm does not have the initialization 
problems of NonLinear Least Squares (NLLS) and is faster 
than convex programming techniques. We further simplify the 
algorithm to get a faster solution. We show by simulations 
that the faster version is also convergent to the optimal point 
with very high probability. The results show that the proposed 
optimal point is very close to the ML global minimum point, 
so that we loose a negligible performance compared to ML. 
Note that ML is very costly to be implemented when the 
model order grows. We further compare LASSO to other 
approximations of ML such as Space alternating Generalized 
EM (SAGE (H). 

II. Problem statement 

This work concerns the problem of DOA estimation with 
an array of m sensors, receiving one snapshot of narrowband 
signals from n far sources. Due to the far-field model of the 
transmitting wave, the received signal pattern is mostly defined 
by phase shifts at each sensor so that the received signal vector 
x can be written as (|11|]) 



x = a (^) s i + n = A(0)s + n, 



(D 



where s = [si S2 ... s n ] T , O = [6i $2 ■ ■ ■ 6 n ], and n are 
the source waveform, DOA, and measurement noise vectors 
respectively. Furthermore, a((9) is the steering vector, which 
represents the phase shift operations at different sensors cor- 
responding to the DOA 8. The problem is to estimate the 
DOA vector 6 given the measurement vector x assuming an 
uncorrelated, circularly symmetric, centered Gaussian noise 
vector n. 



A. Conventional Solutions 

Under the above statistical assumptions of the noise vector, 
the deterministic ML estimator is given by ( ifTTI ) 



{8, s) = argmin ||x - A(0)s||2, 

(0,8) 



(2) 



where 9 G [0 ir] n . The solution of [2] could be found by not- 
ing its corresponding Karush-Kuhn-Tucker (KKT) conditions 

(E2) 

a H {9 l )h = 



V#i 6 9 : 



R(sjd ff (^)n) = 0. 



(3) 



where n = x A(9)s and d(0) = 9 *£P ■ The system (01 
does not have a unique solution although the ML global 
optimum is unique. However, starting from a sufficiently 
close point, and following a Newton recursive algorithm, the 
solution may converge to the global optimum. We introduce 
Algorithm Q] as a typical such solution in which we write 
the complex waveforms in polar coordinates as s,; = rte^ ai 
and Jml is given in where Ai and A 2 are diagonal 
matrices who's diagonal elements are given by d {6i)n 
and c H (9i)h with c(0) = id SP , respectively. Furtheremore, 
D(0) = [d(6>i) d(6> 2 )...d(e»„)] and S and T are diagonal 
matrices who's diagonal elements are Sj and e? ai respectively. 
More details of such methods could be found in ifTJl . 

Algorithm 1 ML estimator Newton solver 
9 <— 9o and s ■(— So. 
while not converging do 
n «— x — A(9)s 

VML i- [M(A H (9)h) T Z(A H (9)h) T K(T> H (9)n) T ] T 

Vo <- Jml^ml 
for i=l:n do 

n + r)o in+i 

a t <- ^ + r? ,2n+j 
end for 
end while 

The convergence could be checked simply by thresholding 
the difference in the sequence of estimates or the sequence 
of the cost function. Although the Algorithm Q] is naive in 
practice, due to the inversion instability of the Jacobian matrix 
J, we introduce it as a base of development toward our 
proposed method. As we have already explained, the direct 
ML realization is costly when the desired model order is high. 
Furtheremore, such method is not stable due to the . 

There has been a variety of approximate, low-cost solutions 
to ML such as Expectation-Maximization (EM) algorithms and 
Space Alternating Generalized EM (SAGE |flj))) as well as 
RELAX [14| all more or less dependent on the choice of 
the initial values. As another alternative, the possibility of 
estimating DOAs using t\ penalized Ordinary Least Squares 
(OLS) has been known and discussed for almost a decade 
(|3|). This method solves the problem of local minima by 
minimizing an approximate convex cost function which is 



independent of the initial values. Assume a discretization 
99 = {6{, 0f , ...,6%}of [0 tt]. According to 0, the solution 
of (O could be approximated by first solving 

s 9 = argmini||x - A*s»||| + A||s»[|i, (5) 

where A 9 = A(9 S ) and A is a regularization parameter |0J, 
and next introducing 9 as the elements in 9 9 with nonzero 
corresponding s 9 elements. 

The parameter A sets the compromise between the level of 
sparseness and model fit, and is usually hard to determine 
analytically. On the other hand, realizing LASSO in (O for a 
big set of A values is infeasible. In [9|, an alternative solution 
is introduced by the SPS-LASSO stagewise algorithm, which 
only needs LASSO realizations at a recursively determined 
finite set of candidate A values. This could be done by 
observing the following optimality conditions of LASSO [9| 

\a H (9 9 )n\ < A 

6f e I ^> a H {9)h = X^h ' (6) 



V0f e 9 9 



where / = {0? \s 9 ^ 0} and n = x A 9 s. Let us define the 
following function: 

1) Marginalized Source Attractor (G): Given the quadruple 
(J, r, a, A) and a point 6, the function G(I, r, a, cf>; 9) gives the 
convergence point (r' , a', A') of the Newton iterative solution 
of the following system of equations starting from (r, a, A) 
with Si — rieJ ai . 



i (9 l )ii' = \'e ja > 
|a ff (^)A'l = A'. 



(7) 



where n' = x — A(J)s'. The G function could also be 
computed faster using an iterative algorithm as in (9|, but we 
will not express it here. The algorithm could be written as 
Algorithm 12 where G\ denotes the A element of the function 
G. Note that it is assumed that between each two candidate 
points in the solution path of LASSO (as a function of A) the 
DOA estimate is constant. 

Algorithm 2 SPS LASSO 

#0 ^— argmax \& H (9 9 )x.\ 

I <- {9 Q } and A <- \& H (9 )x\ 

r «- and e^ a f- 4ft 

counter 0. 

while counter < n do 

9\ <— argmaxGA(/, r, a, A; 9 9 ) 

I I U 0i 

(r, a, A) 4- G(I, r, a, A; <M 

m <r- and e 1 "^ <- aJf(e ° )x 
re 1 <r- u anu e <r- | a H( fl ^ x | 

counter <— counter+1. 

end while 



The performance of SPS-LASSO is still limited by the 
size of the grid which may not be increased arbitrarily due 
to the complexity of convex optimization techniques |6l. 



R(Ai - A H (e)B(e)s) 

9f(Ai - A ff (0)D(0)S) 
3?(A 2 -D H (0)D(0)S) 

j = [ -D(0)S 



-K(A H (0)A(0)r) Q(A ff (0)A(0)S) 
-3(A ff (0)A(0)r) -5ft(A ff (0)A(0)S) 
-3(D H (0)A(0)r) -3?(D H (0)A(0)S) 

-A(e)r -jA(0)s ] 



(4) 



Furthermore, SPS-LASSO works incorrectly in very high SNR 
regimes, where the desired regularization parameter is ex- 
tremely small and the approximately fixed DOAs assumption 
does not hold. 

To overcome such difficulties, in this work we redefine the 
parameter space as the space of low dimensional non sparse 
vectors. However, inspired by the optimality conditions of 
LASSO, we introduce slightly modified feasible conditions 
which do not depend on any discretization. We demonstrate 
the details of two algorithms to find the unique optimal point 
of these new conditions. 

III. Continuous LASSO 

Inspired by SPS-LASSO, we look for a finite subset I = 
{81,62, ■■■ ,0 r } C [0 7r] and a complex function Si = s (8 i) 
on I satisfying 

where n = x A(/)s. We also write s = re 3 " where (r, a) 
are the polar coordinates of the function s. The existence and 
uniqueness of such set as well as consistency conditions of 
such method is proved but will not be presented in this work. 

Finding such a set is guaranteed by the homotopy rule 
and the similarities to convex LASSO based estimation. Note 
that finding the solution for a fixed A by a gradient descent 
algorithm is not possible due to the infinite dimensional nature 
of the problem. Thus, we assume that A is variable. Note that 

for the values of A > max|a H (#)x| the solution is given by 

9 

I = 0. Note that conditions ((HJ imply that each &i € I is 
a global maximum point for the function f (8) = \a. H (6)h\. 
Thus, its derivative is zero which after some manipulations 
results in the condition 



system of equations 



5t(s*d ff (pi)n) = 0. 



(9) 



The equation © and the second line of (HJ might be compared 
to the ML optimality conditions in OJ. The main idea in 
this work is to relax the assumption that DOAs are fixed in 
the smooth homotopy pieces. Roughly speaking, this means 
that the marginalized source attractor G should be modified 
to include DOA changes imposed by © as well as source 
changes. Accordingly, we modify G as follows: 

2) Local Marginalized Attractor (F): : Given a quadruple 
(I, r, a, A) and a point 8, the function F(I, r, <fi, A; 8) gives 
the convergence point quadruple (I', r', a', X') starting from 
(I, r, a, A) and following the Newton recursive solution of the 



n' = A' 



^(gd^)n')=Q 
|a"(0)n'| = A', 



(10) 



where n' = x — A (I') s'. Note that F is obtained from G 
by relaxing the DOA parameters and imposing (|9). The G 
function could be found using a Newton algorithm similar to 
Algorithm[T]by substituting Jml and tjml with Jlma and T7lma 
respectively where 



Vlma 



r]LEA 

\a H (8)n\ 2 - A 2 



(H) 



and 



'LEA 



5R(e J ' Ql ) 
3(e J ' Q2 ) 



(12) 



(sl h (6»)jn ff a(<9)) -2A 
with j given in (0|, 

t?lea - t7ml - X[^ a ) 3(e JQ ) 0] T , (13) 

and 



'LEA — JML 



o o A3?(r) 

-A5R(r) 

00 o 



(14) 



The proposed C-LASSO algorithm is summarized as [3] below. 

The proposed algorithm gives the precise optimal solution 
of (O which as we show by simulations is a proper robust 
approximation of the ML estimator. However, the searching 
steps might still be costly for certain fast applications. In this 
case, we introduce a different method of solving ^ which we 
call C-LASSO^. 

The idea in this modification is that the long jumps in the 
LASSO path by the attractor F could be substituted by a 
smoother development of A. Although it may take more steps 
to achieve the next singular point, the simplicity of compu- 
tations at each step compensates the complicated process at 
searching steps of C-LASSO. Note that the conditions in ([H]) 
imply that the spectrum f (6) = |a ff (0)n| is boundedby A and 



Algorithm 3 C-LASSO 



Algorithm 4 C-LASSO,,, 



#o argmax \a H (#)x 

e 

I f- {9 } and A +- |a ff (6»o)x| 

r ^ and e*« f- 

counter «— 0. 

while counter < n do 

$i f- argmax7\(7, r 7 A; 0) 

(7, r, a, A) <- F(J, r, a, A; f?i) 
7 +- 7 U 0i 

r fll <- and e*»(«0 <- 
counter <— counter+1. 
end while 



achieves this value at the DOA estimates. As a graphical view, 
assume the graph y = f (9) and the line y = A as the limit line 
touching the graph from above at DOA estimates. Decreasing 
A gradually, the limit line decreases and the graph gets more 
compressed. Roughly speaking, some points in the graph resist 
decreasing forming a peak which eventually touches the limit 
line. At this RP value a new DOA at the new touching point 
is introduced, which later follows decreasing with the limiting 
line. Thus, we can check our distance to the next singular 
point when decreasing A by looking at the highest peak point 
different to the estimates. If this peak is smaller than A we can 
continue our pass with a local change of parameters. However, 
at the next singular point the new touching point should be 
added to the active solutions. Otherwise, the extra peak will 
be more than A which indicates a wrong solution and a need 
to increase A. Accordingly, we first introduce the following 
function: 

3) Local Eguilevel Attractor (H): Given a quadruple 
(I, r, a, A), the function H gives the convergence point 
(I', r', a') of the Newton iterative solution of the system of 
equations given by the second line of © and (|9), i. e. 

a"(00n' = A^ 

K(e-i«i d H {9^)n') = 0. (15) 

The function H can be computed fast using the Newton algo- 
rithm similar to the ML solution in Algorithm [TJ substituting 
Jml and t]ml by Jlea and tjlea in (O and (TT~3T > respectively. 
The C-LASSO/j algorithm can then be expressed as in Al- 
gorithm |4] where [i is a parameter setting the compromise 
between the speed and the probability of convergence. 

IV. NUMERICAL RESULTS 

Assuming a Standard Uniform Linear Array , we compared 
the result of applying C-LASSO to the single snapshot DOA 
model with the ML and RELAX estimator. The RELAX 
method alternates between estimating different subsets of 
DOAs using the previous estimate of the complement subset. 
In this work we use a singleton subset at each iteration to 
estimate which is often the case in practice. The ML estimator 
is implemented by first an exhaustive search and then the local 



#o <— argmax \ & H (0)x| 

e 

I f- {6 } and A f- |a ff (f? )x| 
r^Oand e** <- ^gaj* 
counter +- 0. 
while counter < n do 
n <r- x A(7)s 

6\ <— argmax \& H (9)h\ s. t. 6 ^ I is a local maximum. 

e 

p i- \a H (9i)h\ 
if p = A then 

7 <- I U B x 

r a <- and e ja ^ 4- a „ (0l)x 
r$ 1 i— u anu e <r- | a « (e , l)x | 

counter counter+1. 

else 

A <- /xA + (1 - n)p 
{I, r, a) f- H(I,r,a,X) 
end if 
end while 
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Fig. 1. Comparing MSE Vs SNR for 100 trials of single-snapshot data 
generated from a half wavelength ULA of m=15 sensors and 2 sources 
separated by the electrical angle — . 



Newton method solving (O. Figure [T| shows the estimation 
Mean Squared Error (MSE) for these three techniques in 
the case of two sources with s± = s% = 1 and separation 
Acj) = —, when the number of sensors m is 15. As expected, 
LASSO has an estimation error level higher than MSE due to 
its bias. The RELAX estimate gives closer solution in this 
case. However, the C-LASSO reaches the threshold region 
at slightly lower SNR compared to RELAX. Figure [2] shows 
the error result of LASSO compared to RELAX fixing SNR 
to 10 dB and varying the separation in terms of electrical 
angle in the previous scenario. It shows a typical behavior of 
LASSO in the high SNR case, in which very close sources 
are absorbed to one source so that the LASSO solution path 
does not contain any point with correct model order. This is 
shown as the "undefined region" in Figure [2] However, the 
LASSO solution shows instabilities until separation reaches 
the fundamental resolution fl5l . 

In another experiment, we compared the C-LASSOh, and 
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Fig. 2. Comparing MSE Vs separation in electrical angles for 100 trials of 
single-snapshot data generated from a half wavelength ULA of m=15 sensors. 
SNR=10 dB. 
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Fig. 3. C-LASSO performance compared to RELAX. The square, triangle, 
and circle markers show the DOA estimates of the first, second and third 
source respectively while the dashed lines show the result for RELAX. 

RELAX performance in a more complicated case of three 
close sources at electrical angles 2=22£1 with si = so = 
1 and S3 = 0.1 j. The SNR is measured by observing s\ and n\ 
when /i = 0.8. The scenario is hard for the RELAX algorithm 
in the one snapshot case due to the different levels of sources. 
The results are shown in Figure[3] As can be seen, the RELAX 
technique can only resolve the reference signal si, while too 
wide dynamic range makes it hard to estimate the DOAs of 
the second and the third sources respectively. However, one 
may note the higher threshold SNR of C-LASSO^. Note also 
that the realization of C-LASSO^ took 60 times longer time 
than RELAX using MATLAB programming. 

V. CONCLUSION 

In this work, we introduced a LASSO realization technique, 
C-LASSO, in the one snapshot case and an improvement C- 
LASSOh from the complexity point of view which do not 
need any discretization. They are much faster than the convex 
programming realizations, such as the interior point technique 
considering the problem of RP selection. These algorithms 
are based on the idea of following the optimality point path 
of a set of generalized optimality conditions derived from the 
LASSO original ones. 

The results show that in lower SNRs and more sources case, 
C-LASSO dominates ML from complexity point of view with- 



out loosing much performance. Although there exists other 
DOA estimation techniques approximating the ML solution, 
we showed by simulation that the LASSO algorithm is more 
robust to the problem of wide dynamic range . It can be shown, 
while neglected in this work, that the LASSO bias is linear 
with the noise level independent of the true DOAs and sources. 

The C-LASSO algorithms drawback is its computational 
time due to one dimensional search steps and the parameter 
\x in C-LASSO/j, which could not be decreased arbitrarily 
due to the convergence problem. However, sacrificing some 
performance, one may confine the search to a sufficiently 
fine grid neglecting the fine tuning step which can speed up 
the algorithm. Furthermore, while we have not presented the 
mathematical details here, LASSO encounters a consistency 
problem in the one snapshot case of the sources separated 
less than a fundamental resolution (see lfl5l for a similar 
argument). 

Finally, it should be noted that similar to the Group- 
LASSO (G-LASSO) formalism, C-LASSO could be adapted 
to multiple-snapshot model. It is expected that due to the good 
performance of LASSO in high SNRs, the grouped C-LASSO 
also provides a robust technique. 
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